Window shaping functions for watermarking of multimedia signals

ABSTRACT

A window shaping function is described which has an integral over the window shaping function of zero. Compared with conventional window shaping functions, this window shaping function improves the robustness of the watermark signal for a given quality of host signal. Methods and apparatus suitable for utilizing this window shaping function within a watermarking scheme are described.

The present invention relates to window shaping functions, and the use of such functions in apparatus and methods for encoding and decoding information in multimedia signals, such as audio, video or data signals.

Watermarking of multimedia signals is a technique for the transmission of additional data along with the multimedia signal. For instance, watermarking techniques can be used to embed copyright and copy control information into audio signals.

The main requirement of a watermarking scheme is that it is not observable (i.e. in the case of an audio signal, it is inaudible) whilst being robust to attacks to remove the watermark from the signal (e.g. removing the watermark will damage the signal). It will be appreciated that the robustness of a watermark will normally be a trade off against the quality of the signal in which the watermark is embedded. For instance, if a watermark is strongly embedded into an audio signal (and is thus difficult to remove) then it is likely that the quality of the audio signal will be reduced.

Various types of audio watermarking schemes have been proposed, each with its own advantages and disadvantages. For instance, one type of audio watermarking scheme is to use temporal correlation techniques to embed the desired data (e.g. copyright information) into the audio signal. This technique is effectively an echo-hiding algorithm, in which the strength of echo is determined by solving a quadratic equation. The quadratic equation is generated by auto-correlation values at two positions: one at delay equal to r, and one at delay equal to 0. At the detector, the watermark is extracted by determining the ratio of the auto-correlation function at the two delay positions.

WO 00/00969 describes an alternative technique for embedding or encoding auxiliary signals (such as copyright information) into a multimedia host or cover signal. A replica of the cover signal, or a portion of the cover signal in a particular domain (time, frequency or space), is generated according to a stego key, which specifies modification values to the parameters of the cover signal. The replica signal is then modified by an auxiliary signal corresponding to the information to be embedded, and inserted back into the cover signal so as to form the stego signal.

At the decoder, in order to extract the original auxiliary data, a replica of the stego signal is generated in the same manner as the replica of the original cover signal, and requires the use of the same stego key. The resulting replica is then correlated with the received stego signal so as to extract the auxiliary signal.

In such watermarking schemes the additional data to be embedded within the multimedia signal typically takes the form of a sequence of values. This sequence of values is then converted into a slowly varying narrow-band signal by applying a window shaping function to each value. To date, only bell shaped window shaping functions such as raised cosine functions (e.g. the Hanning window function shown in FIG. 1) have been utilized.

It is an object of the present invention to provide an alternative window shaping function that allows improved performance over prior art window shaping functions.

In a first aspect, the present invention provides a method of generating a watermark signal for embedding in a multimedia host signal, the method comprising the steps of: taking a first sequence of values; applying a window shaping function to said sequence of values so as to form a smoothly varying signal suitable for embedding in the host signal; wherein the integral over the window shaping function is zero.

Preferably, said window shaping function has an anti-symmetric temporal behavior.

Preferably, said window shaping function has a bi-phase behavior.

Preferably, the bi-phase window comprises at least two Hanning windows of opposite polarities.

Preferably, the frequency spectrum of the smoothly varying signal has a DC component less than a component of any non-DC peak within the frequency spectrum.

Preferably, each value of the first sequence is represented by a pulse train of width T_(s) so as to form a rectangular wave signal, the window shaping function also being of width T_(s).

Preferably, said first sequence of values is convolved with the window shaping function so as to form said smoothly varying signal.

Preferably, the method further comprises the step of embedding said smoothly varying signal into the host signal.

In a further aspect, the present invention provides an apparatus arranged to generate a watermark signal suitable for embedding in a host multimedia signal, the apparatus comprising:

a) a signal generator arranged to generate a watermark signal by taking a first sequence of values; and
b) processing means arranged to apply a window shaping function to said sequence of values so as to form a smoothly varying signal suitable for embedding in a host signal; wherein the integral over the window shaping function is zero.

Preferably, the apparatus further comprises a watermark embedding apparatus that embeds said smoothly varying signal into the host signal.

In another aspect, the present invention provides a multimedia signal comprising a watermark, wherein the original multimedia signal has been watermarked by a smoothly varying signal formed by applying a window shaping function to a sequence of values, the integral over the window shaping function being zero.

Preferably, the temporal envelope of the original signal has been modified by the watermark.

In a further aspect, the present invention provides a method of detecting a watermark signal embedded in a multimedia signal, the method comprising the steps of:

- receiving a multimedia signal that may potentially be watermarked by a watermark signal modifying the host multimedia signal;
- extracting an estimate of the watermark from said received signal by assuming that the watermark comprises a sequence of values to which a window shaping function has been applied, the integral over the window shaping function being zero; and
- processing the estimate of the watermark with a reference version of the watermark so as to determine whether the received signal is watermarked.

Preferably, the method further comprises the step of applying a window shaping function to said received signal, the integral over the window shaping function being zero.

Preferably, the watermark signal has a payload, and the method further comprises the step of determining the payload of the watermark.

In another aspect, the present invention provides a watermark detector apparatus arranged to detect whether a watermark signal is embedded within a multimedia signal, the watermark detector comprising:

- a receiver arranged to receive a multimedia signal that may potentially be watermarked by a watermark signal modifying the host multimedia signal;
- an extractor arranged to extract an estimate of the watermark from said received signal by assuming that the watermark comprises a sequence of values to which a window shaping function has been applied, the integral over the window shaping function being zero; and
- a processor arranged to process the estimate of the watermark with a reference version of the watermark so as to determine whether the received signal is watermarked.

Preferably, the apparatus further comprises a unit arranged to apply a window shaping function to said received signal, wherein the integral over the window shaping function is zero.

For a better understanding of the invention, and to show how embodiments of the same may be carried into effect, reference will now be made, by way of example, to the accompanying diagrammatic drawings in which:

FIG. 1 illustrates a Hanning window shaping function, as utilized in the prior art;

FIG. 2 illustrates a bi-phase window shaping function according to a preferred embodiment of the present invention, in which the shapes of the two lobes are Hanning window functions;

FIG. 3 illustrates frequency spectra for a sequence w_(di)[k]={1,1,−1,1,−1,−1} conditioned with respectively a Hanning window shaping function and a bi-phase window shaping function;

FIG. 4 illustrates a sequence w_(i) formed by conditioning the sequence w_(di) with the bi-phase window shaping function shown in FIG. 2, and a running integral of w_(i) (∫w_(i));

FIG. 5 illustrates a sequence w_(i) formed by conditioning the sequence w_(di) with the Hanning window shaping function, and a running integral of w_(i) (∫w_(i));

FIG. 6 is a diagram illustrating a watermark embedding apparatus in accordance with an embodiment of the present invention;

FIG. 7 shows a signal portion extraction filter H used in one preferred embodiment;

FIGS. 8a and 8b show respectively the typical amplitude and phase responses of the filter H shown in FIG. 7 as functions of frequency;

FIG. 9 shows the payload embedding and watermark conditioning stage;

FIG. 10 is a diagram illustrating the details of one possible implementation of the watermark conditioning apparatus H_(c) of FIG. 9, including charts of the associated signals at each stage;

FIG. 11 is a diagram illustrating a watermark detector in accordance with an embodiment of the present invention;

FIG. 12 diagrammatically shows the whitening filter H_(w) of FIG. 11, for use in conjunction with a bi-phase window shaping function;

FIG. 13 shows a typical shape of the correlation function output from the correlator of the watermark detector shown in FIG. 11; and

FIG. 14 illustrates a further window shaping function according to an alternative embodiment of the present invention.

FIG. 2 shows a window shaping function as a function of time according to a preferred embodiment of the present invention. The integral over the window shaping function is zero, i.e. the total positive area of the function is equal to the total negative area (such that the net area is zero). The window shaping function is a bi-phase function with anti-symmetric temporal behavior, where each lobe of the window function is a Hanning window function.
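By way of illustration only, the bi-phase window just described can be sketched in Python as two Hanning lobes of opposite polarity. The helper name `biphase_window`, the use of NumPy, and the width of 64 samples are assumptions made for this sketch and are not taken from the described embodiment.

```python
import numpy as np

def biphase_window(Ts):
    """Hypothetical helper: bi-phase window of Ts samples (Ts assumed even)."""
    lobe = np.hanning(Ts // 2)             # one Hanning lobe, half the window width
    return np.concatenate([lobe, -lobe])   # positive lobe followed by negative lobe

s = biphase_window(64)
window_sum = float(np.sum(s))                      # discrete analogue of the integral
is_antisymmetric = bool(np.allclose(s, -s[::-1]))  # s[n] == -s[Ts-1-n]
```

In this sketch the sum over the window vanishes (up to rounding) because the two lobes are identical apart from their sign.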

Use of this window shaping function within watermarking schemes has been shown to offer improved performance compared with the use of the Hanning window shaping function shown in FIG. 1.

FIG. 3 illustrates the frequency spectra corresponding to a watermark sequence (w_(di)[k]={1,1,−1,1,−1,−1}) conditioned with respectively a Hanning and a bi-phase window shaping function. As can be seen, the frequency spectrum for the Hanning window conditioned watermark sequence has a maximum at frequency f=0, whilst the frequency spectrum for the bi-phase shaped watermark sequence has a minimum at f=0, i.e. it has very little DC component.
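The spectral behavior can be explored with a short Python sketch. Because a conditioned sequence is the convolution of a symbol impulse train with the window, its spectrum is the product of the symbol spectrum and the window spectrum; the sketch therefore inspects the window spectra alone, which is a simplification of what FIG. 3 plots. The symbol period of 64 samples and the FFT length of 1024 are assumed values.

```python
import numpy as np

Ts = 64                                   # assumed symbol period in samples
hann = np.hanning(Ts)                     # conventional Hanning window
lobe = np.hanning(Ts // 2)
biphase = np.concatenate([lobe, -lobe])   # bi-phase window built from two lobes

nfft = 1024                               # zero-padded FFT length (assumption)
H = np.abs(np.fft.rfft(hann, nfft))       # magnitude spectrum of Hanning window
B = np.abs(np.fft.rfft(biphase, nfft))    # magnitude spectrum of bi-phase window

hann_peak_bin = int(np.argmax(H))         # bin of the largest Hanning component
biphase_peak_bin = int(np.argmax(B))      # bin of the largest bi-phase component
biphase_dc_ratio = float(B[0] / B.max())  # DC level relative to the spectral peak
```

Under these assumptions the Hanning window has its largest component at the DC bin, whereas the bi-phase window has a vanishing DC component and its peak at a non-zero frequency.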

In many instances, useful information is contained in the non-DC component of the watermark only. Consequently, for the same added watermark energy, a watermark conditioned with the bi-phase window will carry more useful information than one conditioned by the Hanning window shaping function. As a result, the bi-phase window offers superior audibility performance for the same robustness, or conversely, it allows a better robustness for the same audibility quality.

FIG. 4 illustrates a normalized integral (shown as a dotted line) for the sequence w_(di) conditioned with the bi-phase window shaping function shown in FIG. 2. Conversely, FIG. 5 shows a normalized integral for the same sequence conditioned with the Hanning window shaping function. It can be seen that the maximum value of the normalized integral is lower for the sequence conditioned by the bi-phase window function as compared to that for the sequence conditioned by the Hanning window function.
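The comparison of running integrals can be reproduced numerically with the following Python sketch. It conditions the example sequence with each window and compares the peak magnitude of the cumulative sums. The symbol period of 64 samples, the unit peak amplitude of both windows, and the use of an unnormalized cumulative sum (the normalization used in the figures is not specified here) are assumptions of the sketch.

```python
import numpy as np

Ts = 64                                           # assumed symbol period in samples
w_di = np.array([1, 1, -1, 1, -1, -1], dtype=float)

hann = np.hanning(Ts)
lobe = np.hanning(Ts // 2)
biphase = np.concatenate([lobe, -lobe])

def condition(symbols, window):
    """Repeat each symbol over one symbol period and shape it with the window."""
    return np.repeat(symbols, len(window)) * np.tile(window, len(symbols))

# Running integral approximated by a cumulative sum over samples.
max_int_hann = float(np.max(np.abs(np.cumsum(condition(w_di, hann)))))
max_int_biphase = float(np.max(np.abs(np.cumsum(condition(w_di, biphase)))))
```

With the bi-phase window each symbol integrates to zero, so the running sum returns to zero at every symbol boundary, whereas with the Hanning window consecutive symbols of equal sign accumulate.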

Use of this window shaping function will now be described in conjunctionwith a watermarking scheme. However, it will of course be appreciatedthat the application of this window shaping function is not restrictedto the below scheme, but could be applied to other watermarkingtechniques, particularly time domain watermarking techniques. It canalso be used to carry secret keys (e.g. cryptographic keys) that can beused for the re-generation of reference random sequences at the detectorside, allowing the possibility of embedding different random sequencesin different host signals.

FIG. 6 shows a block diagram of the apparatus required to perform thedigital signal processing for embedding a multi-bit payload watermarkw_(c) into a host signal x in accordance with a preferred embodiment tothe present invention.

A host signal x is provided at an input 12 of the apparatus. The hostsignal x is passed in the direction of output 14 via the adder 22.However, a replica of the host signal x (input 8) is split off in thedirection of the multiplier 18, for carrying the watermark information.

The watermark signal w_(c) is obtained from the payload embedder andwatermark conditioning apparatus 6, and is derived from the watermarkrandom sequence w_(s), which is input to the payload embedder andwatermark conditioning apparatus. The multiplier 18 is utilized tocalculate the product of the watermark signal w_(c) and the replicaaudio signal x. The resulting product, w_(c)x is then passed via a gaincontroller 24 to the adder 22. The gain controller 24 is used to amplifyor attenuate the signal by a gain factor α.

The gain factor α controls the trade off between the audibility and therobustness of the watermark. It may be a constant, or variable in atleast one of time, frequency and space. The apparatus in FIG. 6 showsthat, when a is variable, it can be automatically adapted via a signalanalyzing unit 26 based upon the properties of the host signal x.Preferably, the gain α is automatically adapted, so as to minimize theimpact on the signal quality, according to a properly chosenperceptibility cost-function, such as a psycho-acoustic model of thehuman auditory system (HAS). Such a model is, for instance, described inthe paper by E. Zwicker, “Audio Engineering and Psychoacoustics:Matching signals to the final receiver, the Human Auditory System”,Journal of the Audio Engineering Society, Vol. 39, pp. Vol. 115-126,March 1991.

In the following, an audio watermark is utilized, by way of exampleonly, to describe this embodiment of the present invention.

The resulting watermarked audio signal y is then obtained at the output 14 of the embedding apparatus 10 by adding an appropriately scaled version of the product of w_(c) and x to the host signal:

$$y[n] = x[n] + \alpha\, w[n]\, x[n] \qquad (1)$$
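Equation (1) amounts to a sample-wise multiplicative modulation of the host signal, which may be sketched in Python as follows. The function name `embed` and the sample values are illustrative assumptions; a constant gain α is assumed, although the text allows α to vary.

```python
import numpy as np

def embed(x, w, alpha):
    """Hypothetical helper: y[n] = x[n] + alpha * w[n] * x[n], per equation (1)."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    return x + alpha * w * x

x = np.array([0.5, -1.0, 2.0, 4.0])    # toy host samples
w = np.array([1.0, -1.0, 0.0, 0.5])    # toy watermark samples
y = embed(x, w, alpha=0.1)
```

Where the watermark sample is zero the host sample passes through unchanged, and elsewhere the host sample is scaled by (1 + αw[n]).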

Preferably, the watermark w_(c) is chosen such that when multiplied with x, it predominantly modifies the short time envelope of x.

FIG. 7 shows one preferred embodiment in which the input 8 to the multiplier 18 in FIG. 6 is obtained by filtering the replica of the host signal x using a filter H in the filtering unit 15. If the filter output is denoted by x_(b), then according to this preferred embodiment, the watermarked signal is generated by adding the product of x_(b) and the watermark w_(c) to the host signal x.

Let $\bar{x}_b$ be defined such that $\bar{x}_b = x - x_b$, and y_(b) be defined such that $y = y_b + \bar{x}_b$; then the watermarked signal y can be written as

$$y[n] = (1 + w_c[n])\, x_b[n] + \bar{x}_b[n] \qquad (2)$$

and the envelope modulated portion y_(b) of the watermarked signal y is given as

$$y_b[n] = (1 + w_c[n])\, x_b[n] \qquad (3)$$

Preferably, as shown in FIGS. 8a and 8b, the filter H is a linear phase band-pass filter characterized by its lower cut off frequency f_(L) and upper cut off frequency f_(H). As can be seen in FIG. 8b, the filter H has a linear phase response with respect to frequency f within the pass band (BW). Thus, when H is a band-pass filter, x_(b) and $\bar{x}_b$ are the in-band and out-of-band components of the host signal respectively. For optimum performance, it is preferable that the signals x_(b) and $\bar{x}_b$ are in phase. This is achieved by appropriately compensating for the phase distortion produced by filter H. In the case of a linear phase filter, the phase distortion is a simple delay.

In FIG. 9, the details of the payload embedder and watermark conditioning unit 6 are shown. In this unit the watermark seed signal w_(s) is converted into a multi-bit watermark signal w_(c).

Firstly a finite length, preferably zero mean and uniformly distributed random sequence w_(s) is generated using a random number generator with an initial seed S. As will be appreciated later, it is preferable that this initial seed S is known to both the embedder and the detector, such that a copy of the watermark signal can be generated at the detector for comparison purposes. This results in the sequence of length L_(w)

$$w_s[k] \in [-1, 1], \quad \text{for } k = 0, 1, 2, \ldots, L_w - 1 \qquad (4)$$

Then the sequence w_(s) is circularly shifted by the amounts d₁ and d₂ using the circularly shifting units 30 to obtain the random sequences w_(d1) and w_(d2) respectively. It will be appreciated that these two sequences (w_(d1) and w_(d2)) are effectively a first sequence and a second sequence, with the second sequence being circularly shifted with respect to the first. Each sequence w_(di), i=1,2, is subsequently multiplied with a respective sign bit r_(i), in the multiplying unit 40, where r_(i)=+1 or −1, the respective values of r₁ and r₂ remaining constant, and only changing when the payload of the watermark is changed. Each sequence is then converted into a slowly varying narrow-band signal w_(i) of length L_(w)T_(s) by the watermark conditioning circuit 20 shown in FIG. 9. Finally, the slowly varying narrow-band signals w₁ and w₂ are added with a relative delay T_(r) (where T_(r)<T_(s)) to give the multi-bit payload watermark signal w_(c). This is achieved by first delaying the signal w₂ by the amount T_(r) using delaying unit 45 and subsequently by adding it to w₁ with the adding unit 50.
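The generation, circular shifting and sign multiplication of the random sequence may be sketched in Python as below. The helper name `make_shifted_sequences`, the choice of NumPy's `default_rng` as the random number generator, the shift direction of `np.roll`, and all numeric values are assumptions of this sketch; the text does not specify the generator or the shift direction.

```python
import numpy as np

def make_shifted_sequences(seed, L_w, d1, d2, r1, r2):
    """Hypothetical helper: seeded sequence w_s and its signed circular shifts."""
    rng = np.random.default_rng(seed)     # seed S shared by embedder and detector
    w_s = rng.uniform(-1.0, 1.0, L_w)     # uniformly distributed values in [-1, 1)
    w_d1 = r1 * np.roll(w_s, d1)          # circular shift by d1, then sign bit r1
    w_d2 = r2 * np.roll(w_s, d2)          # circular shift by d2, then sign bit r2
    return w_s, w_d1, w_d2

w_s, w_d1, w_d2 = make_shifted_sequences(seed=42, L_w=512, d1=10, d2=300, r1=+1, r2=-1)
w_s_again, _, _ = make_shifted_sequences(seed=42, L_w=512, d1=10, d2=300, r1=+1, r2=-1)
```

Re-running the helper with the same seed reproduces the same sequence, which is the property that allows the detector to regenerate its reference.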

FIG. 10 shows one possible implementation of the watermark conditioning apparatus 20 used in the payload embedder and watermark conditioning apparatus 6 in more detail. The watermark random sequence w_(s) is input to the conditioning apparatus 20.

For convenience, the modification of only one of the sequences w_(di) is shown in FIG. 10, but it will be appreciated that each of the sequences is modified in a similar manner, with the results being added to obtain the watermark signal w_(c).

As shown in FIG. 10, each watermark signal sequence w_(di)[k], i=1,2, is applied to the input of a sample repeater 180. Chart 181 illustrates one of the possible sequences w_(di) as a sequence of values of random numbers between +1 and −1, with the sequence being of length L_(w). The sample repeater repeats each value within the watermark random sequence T_(s) times, so as to generate a pulse train signal of rectangular shape. T_(s) is referred to as the watermark symbol period and represents the span of the watermark symbol in the audio signal. Chart 183 shows the results of the signal illustrated in chart 181 once it has passed through the sample repeater 180.

The window shaping function s[n], which is a bi-phase function as shown in FIG. 2, is then applied to convert the rectangular pulse signals derived from w_(d1) and w_(d2) into slowly varying signals w₁[n] and w₂[n] respectively. The window shaping function is of width T_(s).

The generated signals w₁[n] and w₂[n] are then added up with a relative delay T_(r) (where T_(r)<T_(s)), to give the multi-bit payload watermark signal w_(c)[n], i.e.

$$w_c[n] = w_1[n] + w_2[n - T_r] \qquad (5)$$
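The conditioning chain of sample repetition, window shaping and delayed addition may be sketched in Python as follows. The helper names, the short three-symbol sequences, the symbol period of 64 samples, the delay of T_(s)/4, and the zero-padded (rather than circular) delay are assumptions made for illustration.

```python
import numpy as np

def biphase_window(Ts):
    """Hypothetical helper: two Hanning lobes of opposite polarity."""
    lobe = np.hanning(Ts // 2)
    return np.concatenate([lobe, -lobe])

def condition(w_d, Ts):
    """Sample repeater followed by per-symbol window shaping."""
    pulses = np.repeat(w_d, Ts)                      # rectangular pulse train
    return pulses * np.tile(biphase_window(Ts), len(w_d))

def combine(w1, w2, Tr):
    """w_c[n] = w1[n] + w2[n - Tr], with zeros shifted in (assumption)."""
    delayed = np.concatenate([np.zeros(Tr), w2[:len(w2) - Tr]])
    return w1 + delayed

Ts, Tr = 64, 16                                      # assumed symbol period and delay
w_d1 = np.array([0.3, -0.8, 0.5])
w_d2 = np.array([-0.6, 0.2, 0.9])
w1 = condition(w_d1, Ts)
w2 = condition(w_d2, Ts)
w_c = combine(w1, w2, Tr)
```

Multiplying the repeated pulse train by the tiled window is equivalent to convolving a symbol impulse train with the window, since the window is exactly one symbol period wide.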

The value of T_(r) is chosen such that the zero crossings of w₁ match the maximum amplitude points of w₂ and vice-versa. Thus, for this bi-phase window shaping function T_(r)=T_(s)/4. For other window shaping functions, other values of T_(r) are possible.

As will be appreciated from the description below, during detection the correlation of w_(c)[n] will generate two correlation peaks that are separated by pL (as can be seen in FIG. 13). The value pL is part of the payload, and is defined as

$$pL = \left| d_2 - d_1 \right| \bmod \left\lceil L_w/2 \right\rceil \qquad (6)$$

In addition to pL, extra information can be encoded by changing the relative signs of the embedded watermarks. In the detector, this is seen as a relative sign r_(sign) between the correlation peaks. It will be seen that r_(sign) can take four possible values, and may be defined as:

$$r_{sign} = \frac{2 \cdot \rho_1 + \rho_2 + 3}{2} \in \{0, 1, 2, 3\} \qquad (7)$$

where ρ₁=sign(cL₁) and ρ₂=sign(cL₂) are respectively estimates of the sign bits r₁ (input 80) and r₂ (input 90) of FIG. 9, and cL₁ and cL₂ are the values of the correlation peaks corresponding to w_(d1) and w_(d2) respectively. The overall watermark payload pL_(w), for an error free detection, is then given as a combination of r_(sign) and pL:

$$pL_w = \langle r_{sign},\, pL \rangle \qquad (8)$$

The maximum information (I_(max)), in number of bits, that can be carried by a watermark sequence of length L_(w) is thus given by:

$$I_{\max} = \log_2\left( 4 \cdot \left\lceil L_w/2 \right\rceil \right) \text{ bits} \qquad (9)$$
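The payload arithmetic of equations (6), (7) and (9) can be checked with a few lines of Python. The function names and the numeric inputs are illustrative assumptions, and equation (6) is read here as the absolute difference of the two shifts taken modulo ⌈L_(w)/2⌉.

```python
import math

def peak_distance(d1, d2, L_w):
    """Equation (6), read as |d2 - d1| mod ceil(L_w / 2)."""
    return abs(d2 - d1) % math.ceil(L_w / 2)

def relative_sign(rho1, rho2):
    """Equation (7): maps the sign pair (rho1, rho2) onto {0, 1, 2, 3}."""
    return (2 * rho1 + rho2 + 3) // 2

def max_information_bits(L_w):
    """Equation (9): log2(4 * ceil(L_w / 2))."""
    return math.log2(4 * math.ceil(L_w / 2))

pL = peak_distance(10, 300, 512)           # 290 mod 256
r_sign = relative_sign(+1, +1)
payload = (r_sign, pL)                     # <r_sign, pL> of equation (8)
bits = max_information_bits(512)
```

For a sequence of 512 symbols, the four sign combinations and the 256 possible peak distances together give 1024 distinct payloads, i.e. 10 bits.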

FIG. 11 shows a block diagram of a watermark detector (200, 300, 400). The detector consists of three major stages: (a) the watermark symbol extraction stage (200), (b) the buffering and interpolation stage (300), and (c) the correlation and decision stage (400).

In the symbol extraction stage (200), the received watermarked signal y′[n] is processed to generate multiple (N_(b)) estimates of the watermark sequence. These estimates of the watermark sequence are required to resolve a time offset that may exist between the embedder and the detector, so that the watermark detector can synchronize to the watermark sequence inserted in the host signal.

In the buffering and interpolation stage (300), these estimates are de-multiplexed into N_(b) separate buffers, and an interpolation is applied to each buffer to resolve possible timescale modifications that may have occurred, e.g. a drift in sampling (clock) frequency may have resulted in a stretch or shrink in the time domain signal (i.e. the watermark may have been stretched or shrunk).

In the correlation and decision stage (400), the content of each buffer is correlated with the reference watermark and the maximum correlation peaks are compared against a threshold to determine the likelihood of whether the watermark is indeed embedded within the received signal y′[n].

In order to maximize the accuracy of the watermark detection, the watermark detection process is typically carried out over a length of received signal y′[n] that is 3 to 4 times that of the watermark sequence length. Thus each watermark symbol to be detected can be constructed by taking the average of several estimates of said symbol. This averaging process is referred to as smoothing, and the number of times the averaging is done is referred to as the smoothing factor s_(f). Thus the detection window length L_(D) is the length of the audio segment (in number of samples) over which a watermark detection truth-value is reported. Consequently, L_(D)=s_(f)L_(w)T_(s), where T_(s) is the symbol period and L_(w) the number of symbols within the watermark sequence. Typically, the length (L_(b)) of each buffer 320 within the buffering and interpolation stage is L_(b)=s_(f)L_(w).

In the watermark symbol extraction stage 200 shown in FIG. 11, the incoming watermarked signal y′[n] is input to the signal conditioning filter H_(b) (210). This filter 210 is typically a band pass filter and has the same behavior as the corresponding filter (H, 15) in the watermark embedder 10. The output of the filter H_(b) is y′_(b)[n], and assuming linearity within the transmission medium, it follows from equations (1) and (3):

$$y'_b[n] \approx y_b[n] = (1 + \alpha\, w[n])\, x_b[n] \qquad (10)$$

Note that in the above expression, the possible time offset between the embedder and the detector is implicitly ignored. For ease of explanation of the general watermarking scheme principles, from now on, it is assumed that there is perfect synchronism between the embedder and the detector (i.e. no offset). It should be noted however that if there is not perfect synchronism between the embedder and the detector, then the deviation can be compensated for within the buffering and interpolation stage 300 utilizing techniques known to the skilled person, e.g. iteratively searching through alternative shifts in scale and offset until a best match is achieved.

Note that when no filter is used in the embedder (i.e., when H=1) then H_(b) in the detector can also be omitted, or it can still be included to improve the detection performance. If H_(b) is omitted, then y_(b) in equation (10) is replaced with y. The rest of the processing is the same.

Assuming that the audio signal is divided into frames of length T_(s), and that y′_(b,m)[n] is the n-th sample of the m-th filtered frame signal, the energy E[m] corresponding to the m-th frame is thus:

$$E[m] = \sum_{n=0}^{T_s - 1} \left| y'_{b,m}[n]\, S[n] \right|^2 \qquad (11)$$

where S[n] is the same window shaping function used in the watermark conditioning circuit of FIG. 10. A person skilled in the art will appreciate that equation 11 represents a matched filter receiver, and is the optimum receiver when the symbol period is perfectly synchronized. Notwithstanding this fact, from now on, we set S[n]=1 in order to simplify subsequent explanations.

Combining this with equation 10, it follows that:

$$E[m] \approx \sum_{n=0}^{T_s - 1} \left| y_{b,m}[n] \right|^2 = \sum_{n=0}^{T_s - 1} \left| \left( 1 + \alpha\, w_e[m] \right) x_{b,m}[n] \right|^2 \qquad (12)$$

where w_(e)[m] is the m-th extracted watermark symbol and contains N_(b) time-multiplexed estimates of the embedded watermark sequences. Solving for w_(e)[m] in equation 12 and ignoring higher order terms of α gives the following approximation:

$$w_e[m] \approx \frac{1}{2\alpha} \left( \frac{\sum_{n=0}^{T_s - 1} \left| y_{b,m}[n] \right|^2}{\sum_{n=0}^{T_s - 1} \left| x_{b,m}[n] \right|^2} - 1 \right) \qquad (13)$$
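The step from equation 12 to equation 13 can be spelled out as follows, under the assumption already made in the text that w_(e)[m] is constant over the m-th frame. The factor (1+αw_(e)[m]) can then be taken outside the sum, giving

$$\frac{\sum_{n=0}^{T_s - 1} \left| y_{b,m}[n] \right|^2}{\sum_{n=0}^{T_s - 1} \left| x_{b,m}[n] \right|^2} = \left( 1 + \alpha\, w_e[m] \right)^2 = 1 + 2\alpha\, w_e[m] + \alpha^2 w_e[m]^2 .$$

Dropping the term in α², which is small when α is small, leaves

$$\frac{\sum_{n=0}^{T_s - 1} \left| y_{b,m}[n] \right|^2}{\sum_{n=0}^{T_s - 1} \left| x_{b,m}[n] \right|^2} - 1 \approx 2\alpha\, w_e[m],$$

and dividing both sides by 2α yields the approximation of equation 13.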

In the watermark extraction stage 200 shown in FIG. 11, the output y′_(b)[n] of the filter H_(b) is provided as an input to a frame divider 220, which divides the audio signal into frames of length T_(s), i.e. into y′_(b,m)[n], with the energy calculating unit 230 then being used to calculate the energy corresponding to each of the framed signals as per equation (11). The output of this energy calculation unit 230 is then provided as an input to the whitening stage H_(w) (240), which performs the function shown in equation 13 so as to provide an output w_(e)[m].

It will be realized that the denominator of equation 13 contains a term that requires knowledge of the host signal x. As the signal x is not available to the detector, the denominator of equation 13 must be estimated in order to calculate w_(e)[m].

Below is described how such an estimation can be achieved for the bi-phase window shaping function, but it will equally be appreciated that the teaching could be extended to other window shaping functions.

It will be seen by examination of the bi-phase window function shown in FIG. 2 that, when the audio envelope is modulated with such a window function, the first and the second halves of the frame are scaled in opposite directions. In the detector, this property is utilized to estimate the envelope energy of the host signal x.

Consequently, within the detector, the audio frame is first sub-divided into two halves. The energy functions corresponding to the first and second half frames are hence given by

$$E_1[m] = \sum_{n=0}^{T_s/2 - 1} \left| y'_{b,m}[n] \right|^2 \qquad (14)$$

and

$$E_2[m] = \sum_{n=T_s/2}^{T_s - 1} \left| y'_{b,m}[n] \right|^2 \qquad (15)$$

respectively. As the envelope of the original audio is modulated in opposite directions within the two sub-frames, the original audio envelope can be approximated as the mean of E₁[m] and E₂[m].

Further, the instantaneous modulation value can be taken as the difference between these two functions. Thus, for the bi-phase window function, the watermark w_(e)[m] can be approximated by:

$$w_e[m] \approx \frac{1}{2\alpha} \left( \frac{E_1[m] - E_2[m]}{E_1[m] + E_2[m]} - 1 \right) \qquad (16)$$

Consequently, the whitening filter H_(w) 240 for a bi-phase window shaping function can be realized as shown in FIG. 12. Inputs 242 and 243 respectively receive the energy functions of the first and second half frames E₁[m] and E₂[m]. Each energy function is then split up into two, and provided to adders 245 and 246 which respectively calculate E₁[m]−E₂[m] and E₁[m]+E₂[m]. Both of these calculated functions are then passed to the calculating unit 248, which divides the value from adder 245 by the value from adder 246 so as to calculate an estimate for the watermark w_(e)[m], in accordance with equation 16.
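The half-frame energy calculation and the division performed by the calculating unit can be sketched in Python as below. The sketch computes only the ratio (E₁−E₂)/(E₁+E₂); the scale factor and constant of equation 16 are left out. The helper names, the constant-envelope toy host signal, the gain of 0.1 and the symbol period of 64 samples are assumptions chosen so that the result is easy to verify.

```python
import numpy as np

def half_frame_energies(y_b, Ts):
    """Split the signal into frames of Ts samples and sum squares per half frame.

    The signal length is assumed to be an exact multiple of Ts.
    """
    frames = np.asarray(y_b, dtype=float).reshape(-1, Ts)
    E1 = np.sum(frames[:, :Ts // 2] ** 2, axis=1)   # first half of each frame
    E2 = np.sum(frames[:, Ts // 2:] ** 2, axis=1)   # second half of each frame
    return E1, E2

def whiten(E1, E2):
    """Normalized energy difference, as produced by the divider."""
    return (E1 - E2) / (E1 + E2)

# Toy check: constant-envelope host modulated by bi-phase shaped symbols.
Ts, alpha = 64, 0.1
lobe = np.hanning(Ts // 2)
s = np.concatenate([lobe, -lobe])                   # bi-phase window
symbols = np.array([1.0, -1.0, -1.0, 1.0])
w_c = np.repeat(symbols, Ts) * np.tile(s, len(symbols))
x_b = np.ones(len(w_c))                             # assumed host signal
y_b = (1 + alpha * w_c) * x_b
E1, E2 = half_frame_energies(y_b, Ts)
ratio = whiten(E1, E2)
```

In this toy case the sign of the ratio in each frame reproduces the embedded symbol, because a positive symbol raises the energy of the first half frame and lowers that of the second.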

This output w_(e)[m] is then passed to the buffering and interpolation stage 300, where the signal is de-multiplexed by the de-multiplexer 310, buffered in buffers 320 of length L_(b) so as to resolve any lack of synchronism between the embedder and the detector, and interpolated within the interpolation unit 330 so as to compensate for a possible time scale modification between the embedder and the detector. Such compensation can utilize known techniques, and hence is not described in any more detail within this specification.

As shown in FIG. 11, outputs (w_(D1), w_(D2), . . . w_(DNb)) from the buffering stage are passed to the interpolation stage and, after interpolation, the outputs (w_(I1), w_(I2), . . . w_(INb)) of this stage, which correspond to the different estimates of the correctly re-scaled signal, are passed to the correlation and decision stage. If it is believed that no time scaling compensation is required, the values (w_(D1), w_(D2), . . . w_(DNb)) can be passed directly to the correlation and decision stage 400, i.e. the interpolation stage 330 can be omitted from the apparatus.

The correlator 410 calculates the correlation of each estimate w_(Ij), j=1, . . . , N_(b), with respect to the reference watermark sequence w_(s)[k]. Each respective correlation output corresponding to each estimate is then applied to the maximum detection unit 420, which determines which two estimates provided the maximum correlation peak values. These estimates are chosen as the ones that best fit the circularly shifted versions w_(d1) and w_(d2) of the reference watermark, and the correlation values for these estimate sequences are passed to the threshold detector and payload extractor unit 430.
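The correlation step can be illustrated with the following Python sketch, which computes a circular cross-correlation via the FFT and locates the two strongest peaks. It is simplified to a single, noise-free estimate that contains both shifted sequences; the helper name, the seed, the sequence length and the shifts are assumptions, and the normalization by the standard deviation of the correlation output is only a rough stand-in for the confidence level cL.

```python
import numpy as np

def circular_correlation(estimate, reference):
    """c[tau] = sum_k estimate[k] * reference[k - tau], indices modulo the length."""
    n = len(reference)
    spec = np.fft.rfft(estimate) * np.conj(np.fft.rfft(reference))
    return np.fft.irfft(spec, n=n)

rng = np.random.default_rng(1234)                 # assumed generator and seed
L_w, d1, d2 = 1024, 100, 350                      # assumed length and shifts
w_s = rng.uniform(-1.0, 1.0, L_w)                 # reference sequence
estimate = np.roll(w_s, d1) - np.roll(w_s, d2)    # r1 = +1, r2 = -1, no noise

corr = circular_correlation(estimate, w_s)
cL = corr / np.std(corr)                          # rough confidence level
peaks = np.argsort(np.abs(cL))[-2:]               # two largest-magnitude peaks
peak_set = {int(p) for p in peaks}
```

The two peaks appear at the delays corresponding to the two circular shifts, and their signs follow the sign bits used at the embedder.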

If the interpolation stage is omitted, alternatively the correlator 410calculates the correlation of each estimate w_(Dj), j=1, . . . , N_(b)with the reference watermark sequence w_(s)[k] and the results arepassed on for subsequent processing to the units 420 and 430 as outlinedin the above paragraph.

The payload extractor unit 430 may be utilized to extract the payload(e.g. information content) from the detected watermark signal. Once theunit has estimated the two correlation peaks cL₁ and cL₂ that exceed thedetection threshold, the distance pL between the peaks (as defined byequation (6)) is measured. Next, the signs ρ₁ and ρ₂ of the correlationpeaks are determined, and hence r_(sign) calculated from equation (7).The overall watermark payload may then be calculated using equation (8).

For instance, it can be seen in FIG. 13 that pL is the relative distance between the two peaks. Both peaks are positive, i.e. ρ₁=+1 and ρ₂=+1. From equation (7), r_(sign)=3. Consequently, the payload pL_(w)=<3, pL>.
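
The payload extraction steps above might be sketched as follows. This is an illustrative sketch only: the peak distance is taken here as a simple difference of delays, and the mapping from the two signs to r_(sign) is a hypothetical stand-in that is chosen solely because it reproduces the worked example (ρ₁=+1, ρ₂=+1 giving r_(sign)=3). Equations (6) and (7) define the actual quantities, and all function names are assumptions.

```python
def sign_of(value):
    """Sign of a correlation peak, +1 or -1."""
    return 1 if value >= 0 else -1

def sign_code(rho1, rho2):
    """Hypothetical stand-in for equation (7).

    Maps the sign pair to an integer 0..3; (+1, +1) maps to 3, which is
    the only case confirmed by the worked example in the text.
    """
    return 2 * ((rho1 + 1) // 2) + (rho2 + 1) // 2

def extract_payload(peak_a, peak_b):
    """Each peak is a (delay, correlation value) pair.

    Returns the pair <r_sign, pL>, with pL taken as the difference of the
    two delays (an assumption standing in for equation (6)).
    """
    (d1, c1), (d2, c2) = sorted([peak_a, peak_b])
    p_l = d2 - d1
    r_sign = sign_code(sign_of(c1), sign_of(c2))
    return (r_sign, p_l)

# Two positive peaks at hypothetical delays 40 and 200.
result = extract_payload((40, 12.0), (200, 15.5))  # (3, 160)
```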

The reference watermark sequence w_(s) used within the detector corresponds to (a possibly circularly shifted version of) the original watermark sequence applied to the host signal. For instance, if the watermark signal was calculated using a random number generator with seed S within the embedder, then equally the detector can calculate the same random number sequence using the same random number generation algorithm and the same initial seed so as to determine the watermark signal. Alternatively, the watermark signal originally applied in the embedder and utilized by the detector as a reference could simply be any predetermined sequence.
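
As a minimal sketch of the shared-seed approach described above, the embedder and the detector can each regenerate the same sequence from the same generator and seed. The choice of NumPy's default generator, the normal distribution, the seed value and the sequence length are assumptions made purely for illustration.

```python
import numpy as np

def watermark_sequence(seed, length):
    """Generate a pseudo-random sequence that is fully determined by the seed."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal(length)

SEED = 1234    # hypothetical shared seed S
LENGTH = 1024  # hypothetical sequence length

embedder_sequence = watermark_sequence(SEED, LENGTH)
detector_sequence = watermark_sequence(SEED, LENGTH)

# Same algorithm and same seed give an identical reference at the detector.
sequences_match = np.array_equal(embedder_sequence, detector_sequence)
```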

FIG. 13 shows a typical shape of a correlation function as output from the correlator 410. The horizontal scale shows the correlation delay (in terms of the sequence bins). The vertical scale on the left-hand side (referred to as the confidence level cL) represents the value of the correlation peak normalized with respect to the standard deviation of the typically normally distributed correlation function.

As can be seen, the typical correlation is relatively flat with respect to cL, and centered about cL=0. However, the function contains two peaks, which are separated by pL (see equation (6)) and extend upwards to cL values that are above the detection threshold when a watermark is present.

A horizontal line (shown in the figure as being set at cL=8.7) represents the detection threshold. The detection threshold value controls the false alarm rate.
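
The normalization to a confidence level and the comparison with the detection threshold might be sketched as follows. This sketch assumes that the standard deviation is estimated from the whole correlation function (peaks included, which slightly inflates the estimate); the synthetic correlation data and the function names are hypothetical, and only the threshold value cL=8.7 is taken from the example in the text.

```python
import numpy as np

def confidence_levels(corr):
    """Normalize a correlation function by its standard deviation."""
    corr = np.asarray(corr, dtype=float)
    return corr / corr.std()

def peaks_above_threshold(corr, threshold=8.7):
    """Return the delays whose confidence level exceeds the threshold in magnitude."""
    c_l = confidence_levels(corr)
    return [int(i) for i in np.flatnonzero(np.abs(c_l) > threshold)]

# Hypothetical correlation function: unit-variance noise plus one positive
# and one negative peak.
rng = np.random.default_rng(7)
corr = rng.standard_normal(1024)
corr[100] = 20.0
corr[300] = -15.0

detected = peaks_above_threshold(corr)
```

The magnitude is compared with the threshold so that a negative peak, whose sign contributes to the payload, is also detected.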

Two kinds of false alarms exist: the false positive rate, defined as the probability of detecting a watermark in non-watermarked items, and the false negative rate, defined as the probability of not detecting a watermark in watermarked items. Generally, the requirement on the false positive rate is more stringent than that on the false negative rate. The right-hand side scale on FIG. 13 illustrates the probability of a false positive alarm p. As can be seen, in the example shown, the probability of a false positive p=10⁻¹² is equivalent to the threshold cL=8.7, whilst p=10⁻⁸³ is equivalent to cL=20.

After each detection interval, the detector determines whether or not the original watermark is present, and on this basis outputs a “yes” or a “no” decision. If desired, to improve this decision making process, a number of detection windows may be considered. In such an instance, the false positive probability is a combination of the individual probabilities for each detection window considered, dependent upon the desired criteria. For instance, it could be determined that if the correlation function has two peaks above a threshold of cL=7 on any two out of three detection intervals, then the watermark is deemed to be present. Obviously, such detection criteria can be altered depending upon the desired use of the watermark signal and to take into account factors such as the original quality of the host signal and how badly the signal is likely to be corrupted during normal transmission.
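
A decision rule of the "two out of three" kind described above might be sketched as follows. The sketch additionally assumes that the detection windows are statistically independent, so that the combined false positive probability follows a binomial tail; this independence assumption and the function names are not taken from the specification.

```python
from math import comb

def watermark_present(window_decisions, required=2):
    """Deem the watermark present if at least `required` windows detected it."""
    return sum(bool(d) for d in window_decisions) >= required

def combined_false_positive(p_single, windows=3, required=2):
    """Probability that at least `required` of `windows` independent windows
    each produce a false positive with probability `p_single`."""
    return sum(
        comb(windows, i) * p_single**i * (1.0 - p_single) ** (windows - i)
        for i in range(required, windows + 1)
    )

decision = watermark_present([True, False, True])  # two of three windows passed
p_combined = combined_false_positive(0.1)          # 3*0.01*0.9 + 0.001
```

Under this assumption, requiring agreement between several windows allows a lower per-window threshold to be used for the same overall false positive probability.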

It will be appreciated by the skilled person that various implementations not specifically described would be understood as falling within the scope of the present invention.

For instance, whilst the implementation of a particular bi-phase window shaping function has been described, and in particular a bi-phase window shaping function in which each lobe is a Hanning function, it will be appreciated that the present invention is applicable to any window shaping function falling within the scope of the appended claims. The observed reduction in the DC component of the frequency spectrum has been determined to be related to having a window shaping function in which the integral over the function is zero, i.e. the total positive area is equal to the total negative area. Use of such a function reduces the DC component of the frequency spectrum irrespective of the watermark sequence. As useful information is not carried within the DC component, but only within the non-DC components of the signal, any reduction in the DC component is desirable.
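
The effect of a zero-integral window on the DC component can be illustrated with a short numerical sketch. It assumes a discrete bi-phase window built from two Hanning lobes of opposite polarity, and compares it with a single Hanning window of the same width; the window width, the random sequence and the function names are hypothetical.

```python
import numpy as np

def biphase_hanning(width):
    """Bi-phase window: a positive Hanning lobe followed by a negative one."""
    lobe = np.hanning(width // 2)
    return np.concatenate([lobe, -lobe])

def shaped_signal(values, window):
    """Replace each sequence value by a copy of the window scaled by that value."""
    return np.kron(np.asarray(values, dtype=float), window)

rng = np.random.default_rng(1)
values = rng.standard_normal(256)  # hypothetical watermark sequence
width = 64                         # hypothetical window width in samples

# The DC component is the sum of the samples (the zero-frequency FFT bin).
dc_biphase = shaped_signal(values, biphase_hanning(width)).sum()
dc_hanning = shaped_signal(values, np.hanning(width)).sum()
window_integral = biphase_hanning(width).sum()
```

Because the DC component of the shaped signal equals the sum of the sequence values multiplied by the sum of the window samples, it vanishes (to within rounding) for the bi-phase window whatever the sequence, whereas for the single Hanning window it depends on the sequence.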

FIG. 14 shows an example of an alternative window shaping function that would still fall within the scope of the present invention. The function has four lobes. The lobes between adjacent zero-crossing points are Hanning window functions. It will be appreciated that such window shaping functions can be symmetric or anti-symmetric.
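
Multi-lobe windows of this kind might be constructed as in the following sketch, in which each lobe is a Hanning function and the polarity of each lobe is given by a sign pattern. The two sign patterns shown are assumptions chosen to illustrate an anti-symmetric and a symmetric case; the specification does not state which pattern FIG. 14 uses, and the lobe length is hypothetical.

```python
import numpy as np

def multilobe_window(signs, lobe_length):
    """Concatenate Hanning lobes whose polarities are given by `signs`."""
    lobe = np.hanning(lobe_length)
    return np.concatenate([s * lobe for s in signs])

# Four lobes with alternating polarity: anti-symmetric about the centre.
anti_symmetric = multilobe_window([+1, -1, +1, -1], 16)

# Four lobes with mirrored polarity: symmetric about the centre.
symmetric = multilobe_window([+1, -1, -1, +1], 16)

is_anti_symmetric = np.allclose(anti_symmetric, -anti_symmetric[::-1])
is_symmetric = np.allclose(symmetric, symmetric[::-1])
```

In both cases the positive and negative lobes have equal total area, so the sum over the window is zero.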

Whilst only the functionality of the embedding and detecting apparatus has been described, it will be appreciated that the apparatus could be realized as a digital circuit, an analog circuit, a computer program, or a combination thereof.

Equally, whilst the above embodiment has been described with reference to an audio signal, it will be appreciated that the present invention can be applied to other types of signal, for instance video and data signals.

Within the specification it will be appreciated that the word “comprising” does not exclude other elements or steps, that “a” or “an” does not exclude a plurality, and that a single processor or other unit may fulfil the functions of several means recited in the claims.

1. A method of generating a watermark signal for embedding in a multimedia host signal, the method comprising the steps of: taking a first sequence of values; applying a window shaping function to said sequence of values so as to form a smoothly varying signal suitable for embedding in the host signal; wherein the integral over the window shaping function is zero.
2. A method as claimed in claim 1, wherein the window shaping function has an anti-symmetric temporal behavior.
3. A method as claimed in claim 1, wherein the window shaping function has a bi-phase behavior.
4. A method as claimed in claim 3, wherein the bi-phase window comprises at least two Hanning windows of opposite polarities.
5. A method as claimed in claim 1, wherein the frequency spectrum of the smoothly varying signal has a DC component less than a component of any non-DC peak within the frequency spectrum.
6. A method as claimed in claim 1, wherein each value of the first sequence is represented by a pulse train of width T_(s) so as to form a rectangular wave signal, the window shaping function also being of width T_(s).
7. A method as claimed in claim 1, wherein said first sequence of values is convolved with the window shaping function so as to form said smoothly varying signal.
8. A method as claimed in claim 1, the method further comprising the step of embedding said smoothly varying signal into the host signal.
9. An apparatus arranged to generate a watermark signal suitable for embedding in a host multimedia signal, the apparatus comprising: a) a signal generator arranged to generate a watermark signal by taking a first sequence of values; and b) processing means arranged to apply a window shaping function to said sequence of values so as to form a smoothly varying signal suitable for embedding in a host signal; wherein the integral over the window shaping function is zero.
10. An apparatus as claimed in claim 9, wherein the apparatus further comprises a watermark embedding apparatus that embeds said smoothly varying signal into the host signal.
11. A multimedia signal comprising a watermark, wherein the original multimedia signal has been watermarked by a smoothly varying signal formed by applying a window shaping function to a sequence of values, the integral over the window shaping function being zero.
12. A signal as claimed in claim 11, wherein the temporal envelope of the original signal has been modified by the watermark.
13. A method of detecting a watermark signal embedded in a multimedia signal, the method comprising the steps of: (a) receiving a multimedia signal that may potentially be watermarked by a watermark signal modifying the host multimedia signal; (b) extracting an estimate of the watermark from said received signal by assuming that the watermark comprises a sequence of values to which a window shaping function has been applied, the integral over the window shaping function being zero; and (c) processing the estimate of the watermark with a referenced version of the watermark so as to determine whether the received signal is watermarked.
14. A method as claimed in claim 13, the method further comprising the step of applying a window shaping function to said received signal, the integral over the window shaping function being zero.
15. A method as claimed in claim 13, wherein the watermark signal has a payload, and the method further comprises the step of determining the payload of the watermark.
16. A watermark detector apparatus arranged to detect whether a watermark signal is embedded within a multimedia signal, the watermark detector comprising: (a) a receiver arranged to receive a multimedia signal that may potentially be watermarked by a watermark signal modifying the host multimedia signal; (b) an extractor arranged to extract an estimate of the watermark from said received signal by assuming that the watermark comprises a sequence of values to which a window shaping function has been applied, the integral over the window shaping function being zero; and (c) a processor arranged to process the estimate of the watermark with a referenced version of the watermark so as to determine whether the received signal is watermarked.
17. An apparatus as claimed in claim 16, wherein the apparatus further comprises a unit arranged to apply a window shaping function to the said received signal, wherein the integral over the window shaping function is zero.